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Elastic network models (ENMs) are valuable and efficient tools for characterizing the collective 
internal dynamics of proteins based on the knowledge of their native structures. The increasing evi¬ 
dence that the biological functionality of RNAs is often linked to their innate internal motions, poses 
the question of whether ENM approaches can be successfully extended to this class of biomolecules. 
This issue is tackled here by considering various families of elastic networks of increasing complexity 
applied to a representative set of RNAs. The fluctuations predicted by the alternative ENMs are 
stringently validated by comparison against extensive molecular dynamics simulations and SHAPE 
experiments. We find that simulations and experimental data are systematically best reproduced 
by either an all-atom or a three-beads-per-nucleotide representation (sugar-base-phosphate), with 
the latter arguably providing the best balance of accuracy and computational complexity. 


I. INTRODUCTION 

Gharacterizing the functional dynamics of RNA 
molecules is one of the key standing issues in molecu¬ 
lar biology. The interest in this topic is spurred by the 
ongoing discovery of ever new biological roles that RNAs 
can have in different contexts (see, e.g., [1] for a recent 
review) and, at the same time, by the realization that 
the structure —>■ function relationship of these molecules 
is often related to their internal dynamics [2]. In this 
respect, theoretical approaches hold much potential for 
complementing experiments and provide valuable quan¬ 
titative insight into the functional dynamics of RNAs. 
For instance, molecular dynamics (MD) simulations with 
atomistic force fields have been used to reproduce exper¬ 
imental measurements and aid their interpretation (see, 
e.g., Refs [^^). However, it may be argued that one of 
the most important limitations to the systematic use of 
atomistic MD simulations for characterizing the behav¬ 
ior of RNA is their intensive computational demand. In 
fact, most if not all current MD studies are still limited 
to the ps timescale. 

For this reason, several efforts are being spent towards 
developing coarse-grained approaches capable of striking 
a good balance between accuracy and computational ef¬ 
ficiency (see, e.g., refs [TUHH])- In this respect, it should 
be noted that coarse-grained models are valuable not 
only because they are amenable to extensive numerical 
characterization, but precisely because their simplified 
formulation can offer important insight into the main 
physico-chemical mechanisms that underpin the behavior 
and properties of a given biomolecule. 

For proteins, a successful class of such simplified mod¬ 
els are elastic networks. These models were originally 
motivated by the seminal work of Tirion m who showed 
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that the Hessian of the potential energy of a globular 
protein computed from an atomistic force field could 
be reliably reproduced by replacing the detailed inter¬ 
atomic forces by spring-like, harmonic interactions. This 
remarkable fact was rationalized a posteriori in terms 
of the large-scale character that low-energy fluctuations 
have in proteins, which makes them amenable to be cap¬ 
tured with models that are oblivious of the details of the 
potential |18H22j . This observation, in turn, prompted 
further development of simplified harmonic models where 
the structural descriptions themselves were simplified by 
reducing the number of interaction centers, also termed 
beads. In their simplest formulation, elastic-network 
models (ENM) incorporate harmonic interactions be¬ 
tween pairs of Cq, beads while two-beads 

amino acid representations, e.g. for the main- and side- 
chains [21) . can predict structural fluctuations in very 
good accord with atomistic MD simulations [23]. 

By comparison with proteins, the development and 
application of elastic networks aimed at nucleic acids 
is still relatively unexplored. Bahar and Jernigan first 
applied network models to the conformational dynam¬ 
ics of a transfer RNA using a model with two beads 
per nucleotide [25] . Several authors further simplified 
this model using a single bead placed on the phospho¬ 
rus atom [26ll32j . More recently, Setny and Zacharias 
suggested that the best candidate to host a single ENM 
bead is the center of the ribose sugar in the backbone 
[33] . Other ENMs with more beads per nucleotide have 
also been used |23l|27||32l|34]. Most of these studies as¬ 
sessed the validity of different representations by focus¬ 
ing on their capability to reproduce either the structural 
variability observed across experimental conformers or 
the Debye-Waller factors from X-ray experiments. ENM 
fluctuations were also compared with accurate atomistic 
MD simulations, but the comparison was either limited 
to short time scales |28| or to model simple double he¬ 
lices [55] . 

Towards the goal of identifying the most suitable RNA 
ENM, here we assess the performance of an extensive 
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repertoire of ENMs which are all equally viable a pri¬ 
ori. These models, in fact, differ for the specific single- 
or multi-bead representations used for each nucleotide, as 
well as for the spatial range of the pairwise elastic interac¬ 
tions. As stringent term of reference we perform ps time- 
scale atomistic MD simulations on RNA molecules con¬ 
taining canonical A-form double helices as well as non¬ 
trivial secondary and tertiary structures. Additionally, 
we introduce a procedure to compare fluctuations with 
selective 2'-hydroxyl acylation analyzed by primer exten¬ 
sion (SHAPE) experiments [3SJ[35]. SHAPE reactivity is 
empirically known to correlate with base dynamics and 
sugar pucker flexibility at the nucleotide level m and 
hence is, in principle, well suited for validating predic¬ 
tions of RNA internal dynamics. Recently, Kirmizialtin 
et al. have proposed a link between fluctuations of se¬ 
lected torsional angles and SHAPE reactivity and used 
SHAPE data as an input to improve the accuracy of 
force-field terms in an atomistic structure-based (Go-like) 
model [55] • However, to the best of our knowledge, the 
present study is the first attempt of using SHAPE re¬ 
activity measurements to assess the predictive accuracy 
of three-dimensional coarse-grained models or atomistic 
molecular dynamics simulations. 

We find that the best balance between keeping the 
model complexity to a minimum and yet have an accu¬ 
rate description of RNAs’ internal dynamics is achieved 
when each nucleotide is described by three beads repre¬ 
senting the sugar, the base, and the phosphate (SBP) 
groups. Slightly better results can be obtained using the 
much more computationally-demanding all-atom (AA) 
model. As a matter of fact, the SBP and AA elastic 
network models can reproduce to a very good accuracy 
the principal structural fluctuations as predicted from ps- 
long atomistic MD simulations, both in their directions 
and relative amplitudes. Additionally, they provide a 
satisfactory proxy for the nucleotide-level flexibility as 
captured by experimental SHAPE data. 


II. METHODS 
A. RNA dataset 

We performed atomistic MD simulations on four dif¬ 
ferent RNA molecules (Figure [^. These systems were 
chosen so as to cover a variety of size and structural com¬ 
plexity and yet be amenable to extensive simulations, as 
detailed in Table jlj 

The first entry is the NMR-derived model of the 
cucGUGAG RNA duplex, featuring two central G-U Wob¬ 
ble pairs [39]. As a second system, we considered the 
sarcin-ricin domain (SRD) from E.coli 23S rRNA, which 
consists of a GAGA tetraloop, a flexible region with a 
G-bulge and a duplex region [40]. The U nucleobase 
at the 5^ terminal was excised from the high resolution 
crystal structure. We further considered two more com¬ 
plex molecules: the hammerhead ribozyme m and the 



PDB code 

chain 

length 

simulation 
time (ps) 

Duplex 

lEKA 

16 

1.0 

Sarcin-ricin domain 

1Q9A 

25 

0.9 

Hammerhead ribozyme 301D 

41 

0.25 

add Riboswitch 

1Y26 

71 

0.25 

thiM Riboswitch 

2GDI 

78 

- 


TABLE I: RNA dataset: details and length of MD simula¬ 
tions. For the thiM riboswitch, no MD was performed. 



FIG. 1: Secondary strnctures of the four molecules studied: 
A: eight-base-pairs duplex; B: sarcin-ricin domain; C: ham¬ 
merhead ribozyme; D: add adenine riboswitch. 


add adenine riboswitch |42j . Both systems are composed 
of three stems linked by a three-way junction. In the 
add riboswitch, two hairpins are joined by a kissing loop 
interaction. All these systems, except for the duplex, 
were previously characterized by various computational 
means, including atomistic MD simulations [28ll43ti48] . 


B. Molecular dynamics simulations 


All MD simulations were performed using GROMACS 
4.6.7 [in] with the AMBER99 force field [SD] including 
parmbscO [5T] and xol 3 [52] corrections. GROMACS 
parameters can be found at http: //github. com/srnas/ 
ff. The trajectories were obtained in the isothermal- 
isobaric ensemble (T = 300 K, P = 1 atm) with stochas¬ 
tic velocity rescaling [53] and Berendsen barostat [54] . 
Long range electrostatics were treated using particle- 
mesh-Ewald summation |55] . The equations of motion 
were integrated with a 2 fs time step. All bond lengths 
were constrained using the LINGS algorithm [56]. Na'*’ 
ions were added in the box in order to neutralize the 
charge, and additional Cl“ and Na'*’ at a concentration 
of 0.1 M. AMBER-adapted parameters were used for Na'*' 
m and Cl [58]. The adenine ligand bound to the add 
riboswitch was parametrized using the general Amber 
force field (gaff) [55] and partial charges were assigned 
as discussed in reference [47]. The analyses of the ham¬ 
merhead ribozyme and of the add riboswitch trajectories 
were performed after discarding the first 10 ns and 5 ns, 
respectively. 
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C. Elastic Networks 

In elastic network models a simplified structural repre¬ 
sentation is achieved by representing any monomeric unit 
of the biopolymer with one or more beads. Accordingly, 
the model potential energy is equivalent to the one of a 
set of N beads connected by pairwise harmonic springs 
which penalize deviations of inter-bead distances from 
their typical, reference values. Thus, the elastic network 
does not directly restrain the absolute positions of the 
beads but only their distances. In the simplest formula¬ 
tion, the spring constant of the harmonic pairwise inter¬ 
action is set equal to a master spring constant k whenever 
the reference distance between the two beads is smaller 
than a pre-assigned interaction cutoff (Re), and set to 
zero otherwise. 

The potential energy of the system can be approxi¬ 
mated to second order as 

where the 3N x 3N symmetric matrix, M, is the Hessian 
of U, and is the /r Cartesian component of the devia¬ 

tion of bead i from its position in the reference structure. 


1. Repertoire of possible elastic networks for RNAs 

In protein contexts, the standard formulation of elas¬ 
tic network models is based on the intuitive amino acid 
representation with primary interaction centers located 
on the mainchain (e.g. the Cq, atoms) and possibly 
auxiliary ones for the sidechains [H]. By analogy with 
the case of proteins, one may expect that the primary 
ENM interaction centers could be the phosphate groups, 
which provide the backbone connectivity for single RNA 
strands PS1 - I5?| . Besides this possibility, we here investi¬ 
gated alternative representations considering all possible 
ENMs combinations based on the use of one or more in¬ 
teraction centers representing the three chemical groups 
of each nucleotide: the sugar, the base and phosphate 
(in short S, B and P, respectively). Each group is rep¬ 
resented by a specific atom, namely Cl' for the sugar, 
C2 for the base and P for the phosphate group. This 
choice is largely dictated for convenience of comparison 
with earlier studies [23l [32ll3^ . For each model the inter¬ 
action cutoff distance, Rc, is varied in the 3 — 30 A range 
with 1 A increments so as to assess the dependence of 
the predictions on the degree of connectivity of the elas¬ 
tic network. 


2. Reference structure 

For each RNA dataset entry, the reference structure for 
ENM calculations is set equal to the centroid structure 
of the associated MD trajectory. This is the conformer 
with the lowest average mean square distance from all 


MD-sampled structures after an optimal rigid structural 
alignment [60]. In the case of the add riboswitch, the 
adenine ligand atoms are included in the ENM calcula¬ 
tion. 


D. Comparison of ENMs and MD 

For a detailed and stringent comparison of ENM and 
MD we shall consider the covariance matrix, which pro¬ 
vides information on the structural fluctuations at equi¬ 
librium. The MD covariance matrix entries are defined 
as = {Sri^^Srj^^) , with - (r,,^)) 

where i and j run over the N indexed interaction cen¬ 
ters, and v run over the Cartesian components, and () 
denotes the time-average over the sampled conformations 
after an optimal structural superposition over the refer¬ 
ence structure. When comparing with a coarse-grained 
ENM, the structural alignment and the calculation of 
^MD i^oth performed by exclusively considering the 
same atom types used as beads in the elastic network 
model. For ENM, the covariance matrix is obtained from 
the pseudoinverse M~^ of the interaction matrix defined 
in Eq. as Here fcs is the Boltz¬ 

mann constant and T is the temperature. We observe 
that the ksT term is here required to allow the absolute 
covariance matrix to be properly related to the spring 
stiffness k. However, since in all the comparisons dis¬ 
cussed below we always consider a multiplicative term in 
the covariance matrix as a parameter for the fitting pro¬ 
cedures, the values of both ksT and k is never used in 
practice. 


Effective Interaction Matrix 


When comparing different ENMs one must consider 
only the modes related to the fluctuations of the de¬ 
grees of freedom in common between the models. To 
achieve this, it is necessary to separate the degrees of 
freedom of the beads of interest (with subscript a in 
the following) from the others (with subscript b in the 
following) and compute the effective interaction matrix 
of the former [221 [6TH6^ . This is accomplished by for¬ 
mally recasting the interaction in the following block 


form M = 


Ma 

w 

w'^ 

Mb 


where and are the in¬ 


teraction matrices of the two subsystems, while W rep¬ 
resents the interactions between them. The effective in¬ 
teraction matrix governing the dynamics of subsystem a 
alone is 


Mf = Ma- WM-^W^ (2) 

For a detailed derivation of this equation see [62]. Using 
this effective matrix one can compute the fluctuations 
relative to the subsystem considered. 
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1. Measures of similarity between essential spaces 

The comparison of the essential dynamical spaces of 
ENM and MD simulations is here carried out by con¬ 
sidering two quantities, namely the Pearson correlation 
of mean square fluctuations prohles and the similarity 
between the eigenspaces of covariance matrices. 

The mean square fluctuation (MSF) of a given cen¬ 
ter, i, can be obtained in the MD simulation by 
time-averaging the mean square displacements. Simi¬ 
larly, in ENMs they are given by MSF^ = {5rf) = 
We remark that the amplitudes of 
fluctuations are known to be inversely-correlated to the 
local density, that is the number of neighboring centers 
[M] . We also recall that the MSF profile is computed 
after carrying out an optimal global structural superpo¬ 
sition of all sampled conformers. As a consequence, the 
MSF of any given center depends not only on the local 
structural fluctuations but on the global intra-molecular 
ones too. 

The accord of two covariance matrices, A and B, can 
be measured more directly by comparing their essential 
dynamical spaces, identified by the set of their eigen¬ 
vectors {vyi} and {vb} and eigenvalues {Aa}, {A_b}- A 
stringent measure of this consistency is the root weighted 
square inner product (RWSIP) [SS] 


RWSIP = 








2 ^ 2=1 




(3) 


which takes on values ranging between 1, when the two 
ranked dynamical spaces coincide, and 0, when they are 
completely orthogonal. 

The statistical significance of both the MSF correla¬ 
tion and the RWSIP is assessed by using two terms of 
reference. The first one is given by the degree of consis¬ 
tency of the MSF or RWSIP for first and second halves of 
the atomistic MD trajectories. This sets, in practice, an 
upper-limit for very significant correlations of the observ¬ 
ables. The second one is the degree of consistency of the 
random elastic network (RNM) of Setny et a/. [33] with 
the reference MD simulations. This is a fully-connected 
elastic network where where all pairs of beads interact 
harmonically though, for each pair, the spring constant 
is randomly picked from the [0,1] uniform distribution. 
Because this null ENM does not encode properties of 
the target molecule in any meaningful way, it provides a 
practical lower bound for significant correlations between 
ENMs and MD simulations. 


E. Comparison with SHAPE data 

To compare the fluctuations from both ENMs and MD 
simulations with data from SHAPE experiments we here 
scrutinize several order parameters that, a priori could 
be viable proxies for SHAPE reactivity data, namely: 


i) the variance of the distance between selected pairs of 
beads and ii) the variance of the angle between selected 
triplets of beads. The variance of each distance and an¬ 
gle as obtained from MD was compared with the SHAPE 
reactivity of the corresponding nucleotide for the add ri- 
boswitch taken from ref [66]. Distances and angles were 
computed using PLUMED [57] . 

In the ENM framework, the variance of the distance 
between two beads can be directly obtained from the co- 
variance matrix in the linear perturbation regime as 




44 ,r 

j, iCu. 




'^33,n^ ^i3,nv (4) 


where dF is the /r Cartesian component of the reference 
distance between bead i and j. 

When comparing ENM and SHAPE we also consid¬ 
ered the experimental data relative to the thiM thiamine 
pyrophosphate riboswitch published in ref |66| . For this 
molecule no reference MD simulation was performed and 
ENMs were computed directly on the crystal structure 
(PDB code: 2GDI) [68]. 


III. RESULTS 

For the comparative validation against MD and 
SHAPE data we consider eight different types of elas¬ 
tic networks, as summarized in Table [ll] A subset of the 
considered models have been previously used in differ¬ 
ent contexts [53] 1351 - 133] . With the exception of the all¬ 
atom (AA) model, all other ENMs will be referred to with 
the one, two and three-letter acronyms corresponding to 
which of the phosphate (P), sugar (S), or base (B) inter¬ 
action centers are used, see Fig. [^ We also tested ENMs 
with a higher number of beads (see Fig. SD 1 for an ex¬ 
ample). All the considered ENMs feature a sharp-cutoff 
interaction scheme (as explained in the section Methods). 
Using a distance-dependent elastic constant yields simi¬ 
lar results (Fig. SD 2 for details). 


A. Comparison of ENMs and MD 

The consistency of ENM and MD simulations was as¬ 
sessed by computing the Pearson correlation coefficient 
(i?) for the MSF profiles and the RWSIP for the essen¬ 
tial dynamical spaces. To keep the comparison as simple 
and transparent as possible, each measure was computed 
separately for the S, B and P interaction centers. For 
multi-center ENMs this required the calculation of the 
effective interaction matrix (Eq. . Using as a reference 
the experimental structure in place of the MD centroid 
introduces only minor differences in the results, see SD 3. 
Each measure was then averaged over the four systems in 
Table(see SD 4 for non-averaged values). The results, 
shown in Fig. [^ are profiled as a function of the elas¬ 
tic network interaction cutoff distance, Re- The smallest 
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FIG. 2: Schematic representation of the beads used to con¬ 
struct the ENM. The three atoms used as beads are the C2 
carbon in the base, the Cl' carbon in the sngar ring, and the 
P atom in the phosphate group, as indicated by labels. 


physically-viable value for iic, that is the abscissa of the 
left-most point of the curves, is the minimum value en¬ 
suring that the ENM zero-energy modes exclusively cor¬ 
respond to the six roto-translational modes. 

The main feature emerging from Fig. [^is that, across 
the various models, the highest consistency with MD is 
attained when Rc is marginally larger than its smallest 
physically-viable value. It is also noted that the mini¬ 
mum value of Rc varies significantly across the models: 
for the AA model, which is the most detailed ENM, it 
is as low as 4 A, while for the single-bead ones it is of¬ 
ten larger than 10 A. The MSF and RWSIP accord both 
decrease systematically as Rc is increased starting at the 
optimal value. This fact, which to our knowledge has 
not been reported before, can be rationalized a posteriori 
by considering that upon increasing i?c, one endows the 
network with harmonic couplings among nucleotides that 
are too far apart to be in direct physical interaction, and 
this brings about a degradation in model performance. 

Furthermore, it is noted that the detailed, but also 
computationally more onerous, AA model is consistently 
in better accord with MD data than any of the coarse¬ 
grained ENMs. For this model, the degree of ENM-MD 
consistency is practically as high as the internal MD con¬ 
sistency at the optimal value ~ 7 A, or even higher 
in some cases. As a general trend, we notice that the ac¬ 
cord between MD and ENMs decreases for coarser models 
(see also Fig. SD 5 for models including two beads per 
nucleotide). Importantly, the AA and SBP models per¬ 
form well not only on average but for each considered 
structure, whereas the performance of models with fewer 
interactions centers is less consistent across the repertoire 
of RNA molecules, see Fig. SD 4. For all models, con¬ 
sidering the optimal value of Rc both MSF and RWSIP 


accord are significantly higher than for the null model, 
indicating that all the ENMs are overall capable to cap¬ 
ture the salient physical interactions of the system. 

It is important to mention here that in the MD simu¬ 
lation of the duplex we observed a fraying event at time 
« 670 ns (see Fig. SD 6), followed by a re-zipping into 
the native structure. As a matter of fact, fraying events 
are expected at RNA termini on the ps time-scale cov¬ 
ered by our simulations [M]. In spite of the fact that 
these events are clearly out of the linear perturbation 
regime where one would expect ENM to properly pre¬ 
dict fluctuations, the correlation between MD and ENM 
is reasonably high. By removing from the analysis the 
highly fluctuating terminal base pairs, the correlation is 
further improved (Fig. SD 7). 

In Table [lT| we summarize all the results for the optimal 
cut-off radius, determined as the radius that maximizes 
the RWSIP. The last column of the table reports the av¬ 
erage number of neighbors of a bead, that is the number 
of other beads at distance smaller than Rc from it. 


ENM Cl' 

C2 

P 

others 

best number of 
Rc (A) neighbors 

P 



/ 


20 

15.3 

S 

/ 




15 

9.9 

B 


/ 



17 

14.8 

SP 

/ 


/ 


19 

30.4 

BP 


/ 

/ 


18 

29.9 

SB 

/ 

/ 



11 

15.4 

SBP 

/ 

/ 

/ 


9 

12.0 

AA 

/ 

/ 

/ 

/ 

7 

52.9 


TABLE II: Summary of the tested ENMs. For each model, 
the adopted beads are marked. AA include all heavy atoms. 
Values of the cutoff radius {Rc) that maximize the RWSIP 
and average number of neighbors are also shown. 


1. Effect of ionic strength 

One standing question for RNAs, that is relevant also 
for ENM development (32], is whether and how the in¬ 
ternal dynamics of these biomolecules is affected by the 
concentration and type of counterions in solutions. These 
parameters, in fact, modulate the screening of the elec¬ 
trostatic self-repulsion of RNA backbone and are indeed 
often used to artificially induce RNA unfolding. Because 
current formulations of ENMs, including those consid¬ 
ered here, do not explicitly account for electrostatic ef¬ 
fects, and thus intrinsically provide results that are inde¬ 
pendent of the ionic strength, it is important to ascertain 
to what extent changes of ionic strength would affect the 
collective internal dynamics of the considered RNAs. 

To clarify this point, we carried out MD simulations 
at different nominal concentrations of monovalent salt 
Na+/Cl“. The consistency of the essential dynamical 
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FIG. 3: Agreement between MD simulations and ENM for different radii of cutoff. Correlation between MSF (upper panels), 
and RWSIP (lower panels). Values at the optimal cutoff values are represented by circles. A: phosphate beads; B: sugar beads; 
C: nucleobase beads. The gray regions correspond to values below the random-network model or above the MD self-agreement. 


spaces observed in simulations based on different salt con¬ 
centrations was measured with the RWSIP. Only the C2, 
Cl' and P atoms were considered for computing the es¬ 
sential dynamical spaces. 

Molecule 0.0 M 0.1 M 0.5 M 1.0 M 
Duplex 0.938 0.998 0.991 0.990 
SRD 0.983 0.983 0.982 0.993 

TABLE III: RWSIP between 100 ns trajectories at different 
NaCl concentrations and the 500 ns trajectory at 0.1 M. For 
the duplex, only the first half of the 1 ps trajectory was con¬ 
sidered, thus discarding the contribution of the base fraying 
event (see SD 6). 

As summarized in Table m the essential dynamical 
spaces are very consistently preserved over a wide range 
of ionic strengths. This finding complements a recent 
study of Virtanent et al. [ZD] where the electrostatic 
free energy was shown to be minimally affected by ionic 
strength. In the present context, the result justifies the 
use of RNA elastic networks with no explicitly treatment 
of the ionic strength. It is however important to note 
that our test was limited to monovalent cations. The 
treatment of divalent cations is known to be very chal¬ 
lenging because of force-field limitations and sampling 
difficulties. 

We finally notice that in our simulations with standard 
AMBER ions we did not observe any ion-crystallization 
event m- For maximum robustness we tested the alter¬ 


native ion parameterization by Joung and Cheatham [72j . 
obtaining very similar results. 


B. Comparison with SHAPE data 

To complement the validation of ENM against MD, 
we assessed their consistency with experimental data 
too. To this purpose we considered data obtained from 
SHAPE experiments, which probe RNA structural fluc¬ 
tuations at the nucleotide level m- One standing chal¬ 
lenge is that it is not yet settled which simple structural 
or dynamical observables can be used as viable proxies 
for the SHAPE intensities. To tackle this elusive prob¬ 
lem, we first set out to analyze the MD simulations so 
as to identify the local fluctuations that best correlate 
with SHAPE data. Specifically, we compared our MD 
simulation and available SHAPE data for the add ri- 
boswitch [DDj. A related comparison based on B-factors 
profiles, which are commonly used to validate ENM pre¬ 
dictions (albeit with known limitations [24] ) is provided 
in fig. SD 8. 

As it emerges from Fig. j^, the best correlation with 
experimental SHAPE reactivity was found for the fluc¬ 
tuations of the distance between consecutive C2 atoms 
{R — 0.88). This is remarkable, since the SHAPE reac¬ 
tion does not explicitly involve the nucleobases. These 
fluctuations are shown, as a function of the residue in¬ 
dex, in Figure The result can be interpreted by con- 
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sidering that most of the structural constraints in RNA 
originates from base-base interactions, and fluctuations 
in base-base distance are required for backbone flexi¬ 
bility. The fluctuations of the angle 02'-P-05' instead 
showed a poor correlation with experimental SHAPE 
data (i? = 0.05). We notice here that the value of this 
angle has been shown to correlate with RNA stability 
related to in-line attack and its fluctuations were 
recently used in the SHAPE-FIT approach to optimize 
the parameters of a structure-based force-held using ex¬ 
perimental SHAPE reactivities [35] ■ We also observe 
that the huctuations of the distance between consecu¬ 
tive C2 atoms could be correlated with ribose mobility, 
which in turn depends on sugar pucker [niiisi. Inter¬ 
estingly, C2'-endo conformations have been shown to be 
overrepresented among highly reactive residues in the ri¬ 
bosome [37]. An histogram of C2-C2 distances for se¬ 
lected sugar puckers is shown in Fig. SD 9, indicating 
that C2'-endo conformations correspond to a larger vari¬ 
ability of the C2-C2 distance. In conclusion, although the 
scope of the present SHAPE prohles comparison could be 
affected by the limited accuracy or precision of both ex¬ 
perimental and MD-generated data, the obtained results 
suggest that a good structural determinant for SHAPE 
reactivity is arguably provided by base-base distance huc¬ 
tuations. In Fig. SD 10 we show this comparison using a 
non-parametric measure of correlation. 

Based on this result, we next quantihed to which ex¬ 
tent the ENMs are able to reproduce the prohle of huctu¬ 
ations of the C2-C2 distance. This test complements the 
assessment made using MSF and RWSIP, which mostly 
depends on the agreement of large scale motions and does 
not imply a good performance in the prediction of local 
huctuations. This comparison is presented in Figure [^ 
where the ENM-MD Pearson correlation coefficients for 
each considered ENM are summarized. We remark here 
that the duplex (lEKA) is undergoing a base fraying, so 
that MD exhibits very large huctuations at one terminus 
(see Fig. SD 6). The overall accord between MD and 
ENM is moderately good, although signihcantly worse 
than the accord with the large scale motions presented 
before. Overall, it is seen that the both the SBP model 
and the AA models provide the best agreement. 

In the following, we thus test whether the SBP and 
AA models are capable of reproducing SHAPE reactivi¬ 
ties directly, without the need for an expensive MD sim¬ 
ulation to be performed. ENM and SHAPE data were 
compared for two different molecules, namely the afore¬ 
mentioned add riboswitch and the thiM riboswitch. As 
we can see from Figure [^ the predictions of ENM are 
in qualitative agreement with the SHAPE data. In par¬ 
ticular, high SHAPE reactivity in the loop and junction 
regions correspond to highly fluctuating beads, both for 
the add and thiM riboswitch. We notice that this agree¬ 
ment goes beyond the mere identification of the residues 
involved in Watson-Crick or wobble pairings US], as there 
appear several unpaired bases with a low SHAPE reactiv¬ 
ity. This feature seems to be often correctly reproduced 



FIG. 4: A: Pearson correlation coefficient R, computed be¬ 
tween SHAPE reactivities and the fluctuations of different 
distances (light grey), and angles (dark grey), computed from 
the MD trajectory of the add riboswitch. Residue indexes 
are shown in Fig. SD 10; B: correlation between the fluctua¬ 
tions of the distance of consecutive C2 atoms, from the MD 
simulation and from the different ENMs. 


by the C2-C2 fluctuations profile. By visual inspection, 
it can be seen that non-reactive, non-paired bases often 
engage non-Watson-Crick base pairs as well as stacking 
interactions, as shown in Fig. SD 11. The Pearson cor¬ 
relation coefficients are summarized in Table ITVl In this 
case too, it is found that the AA ENM performs bet¬ 
ter than the SBP ENM which, nevertheless, is much less 
demanding computationally because of its simpler for¬ 
mulation. 

Molecule SBP AA MD 
add 0.64 0.76 0.88 

thiM 0.37 0.59 - 

TABLE IV: Pearson correlation coefficients between C2-C2 
fluctuations predicted by ENM/MD and SHAPE reactivities. 
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FIG. 5: Comparison of the flexibility of the add riboswitch 
(upper panel) and the thiM riboswitch (lower panel). The 
SHAPE reactivities (black) are compared with the C2-C2 
fluctuations predicted by the SBP and the AA models. For 
the add riboswitch, also fluctuations from MD are shown. 
Regions corresponding to residues forming Watson-Crick or 
wobble base pairings are shown in gray. 


IV. DISCUSSION 


The development and performance assessment of elas¬ 
tic networks for RNAs have so far been pursued in two 
main directions. On one hand, Zimmermann and Jerni- 
gan [32] have recently shown that the essential dynamical 
spaces of ENMs based on the phosphate representation of 
RNAs can satisfactorily account for the structural vari¬ 
ability observed across crystal structures homologs. On 
the other hand, Setny and Zacharias [33] have consid¬ 
ered ENMs where different atoms of the RNA backbone 
(i.e. sugar and phosphate groups only) are alternatively 
used to represent nucleotides in short RNA duplexes. 
Within this class of single-bead ENMs and target RNA 
structures, it was found that those based on the sugar- 
group representation yielded the structural fluctuations 
with the best consistency with MD simulations or NMR 
ensembles [33] . 

Here, we tackle this standing challenge by searching 
for the simplest and yet accurate RNA ENM. We ana¬ 
lyze a comprehensive combinations of (i) interaction cen¬ 


ters, or beads, for each nucleotide and (ii) spatial range 
of the elastic interaction. In total, we considered eight 
different types of ENMs, which are listed in Table [Tlj For 
the critical assessment of their performance, we validated 
the predicted structural fluctuations against data from 
[is-long atomistic MD simulations as well as from exper¬ 
imental SHAPE measurements. Finally, towards ensur¬ 
ing model transferability, we considered the four different 
types of RNA molecules listed in Table [I] and represented 
in Fig. [l] These systems cover a significant repertoire of 
different structural elements such as non-canonical base 
pairs, bulges, junctions and tertiary contacts and were 
selected with two main criteria, namely: first, they na¬ 
tively adopt a specihc fold (i.e. have a stable tertiary 
structure, which is a prerequisite for ENM applicability) 
and, secondly, they are amenable to extensive numeri¬ 
cal characterization with ps-long MD simulations in ex¬ 
plicit solvent. We notice that the size of the studied sys¬ 
tems is limited only by the MD computational cost, while 
the ENM method is straightfowardly applicable to larger 
molecules, as it has been done for instance in Ref. [27] . 

In the following we discuss the performance of the var¬ 
ious models listed in Table [IT] starting from those em¬ 
ploying a single-bead nucleotide representation and then 
moving on to the more detailed, multi-bead ones. 

Among the one-bead models the best accord with MD 
data is obtained for the S model, where a nucleotide is 
represented with the Cl' atom of the sugar moiety. In 
this case, when the most appropriate elastic interaction 
range is used (see Table 0 , the accord of ENM and MD 
is significantly larger than the statistical reference (null) 
case, and not too much behind the accord of the first 
and second halves of the MD simulations. This result 
is consistent with the conclusions of the aforementioned 
recent study of Ref. [33] and reinforces them from a sig¬ 
nificantly broader perspective. In fact, the present as¬ 
sessment is carried out for a wider range of RNA motifs 
and the search of the optimal representative atom is not 
limited to the RNA backbone but encompasses the base 
too. 

In this regard, we note that the model with a single 
bead on the C2 atom of the base (B model) reproduces 
structural fluctuations less accurately than the S model 
and the optimal interaction cutoff is more dependent on 
the specihc molecule, a fact that impairs the transfer- 
ability of the model. These shortcomings are even more 
evident in the P model, where a nucleotide is represented 
with the sole phosphorous atom. In fact, both the S and 
B models are better performing than the P one. The re¬ 
sult may be, at hrst, surprising because of the apparent 
analogy between the phosphate representation in RNA 
and the Cq, representation in proteins. The latter is vir¬ 
tually used in all single-beads ENMs for proteins. How¬ 
ever, one should keep in mind a fundamental distinction 
of backbone and side-groups roles for the structural or¬ 
ganization and stability of these two types of biopoly¬ 
mers. In fact, whereas for proteins the backbone self¬ 
interaction (e.g. hydrogen bonding) contributes signifi- 
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cantly to the structural stability, for RNAs the analogous 
role is, in fact, played by the bases and not by the phos¬ 
phate groups [771 HE]- In this regard, it is interesting 
to recall that RNAs have, in fact, been interpreted as 
adopting an “inside-out” organization compared to pro¬ 
teins m- This distinction might help rationalize why the 
P representation does not serve for RNA ENMs equally 
well as the Cq, representation for proteins. 

Moving on to two-beads models, we observe that 
ENMs employing beads both in the bases and in the 
backbone (SB, BP) perform systematically better than 
any single-bead model with only a modest increase in 
the computational complexity. SB and BP models also 
outperforms the SP model. We also stress that being 
able to reproduce the fluctuations of the bases is by itself 
an advantage because their functional role is of primary 
importance in nucleic acids and their dynamics can affect 
different aspects of the behavior of RNA molecules (see, 
e.g.. Refs. P IMIITE] ). 

Increasing the number of beads featured in the ENM 
models (see also Eig. SD 1 for 5/6-beads model), im¬ 
proves the agreement with MD, consistently with what 
had been observed for proteins |5D]. The best overall 
accuracy is indeed observed for the all-atom (AA) ENM. 
We focused our attention on this model, as well as on the 
the SBP model, that uses one bead for each of the sugar, 
base and phosphate groups. In fact, the consistency of 
both models with MD data is practically as high as the 
internal consistency of MD itself. We also note that the 
optimal performance of the SBP model is attained when 
the interaction cutoff distance is about equal to 9 A. This 
is a convenient feature, as this interaction range falls in 
the same viable interaction range of elastic networks for 
proteins mi mi- Furthermore, the typical density of 
beads in protein ENM is very similar to the SBP model 
(Table |n]). In principle, this allow for the perspective of 
integration of proteins and RNA elastic networks to study 
protein/RNA complexes. 

The viability of the SBP and AA models is indepen¬ 
dently underscored by the comparison against experi¬ 
mental SHAPE data, which are notoriously challenging 
to predict. The challenge is at least partly due to the dif¬ 
ficulties of identifying from a priori considerations struc¬ 
tural or dynamical observables that correlate significantly 
with SHAPE data. As a first step of the analysis we 
therefore considered various observables computed from 
atomistic MD simulations against SHAPE data, and es¬ 
tablished that the relative fluctuations of consecutive nu- 
cleobases provide a viable proxy for SHAPE data. Our 
comparative analysis showed that such fluctuations can 
be captured well using the SBP ENM, and to an even 
better extent with the AA ENM. Possibly, this is a step 
in the direction of dehning a model able to directly cor¬ 
relate three-dimensional structures with SHAPE reactiv¬ 
ities. Interestingly, both the ENMs are completely in¬ 


dependent from the dihedral potentials and thus should 
not be directly affected by the pucker conformation of the 
ribose. The fact that they can provide a reasonable esti¬ 
mate of the backbone flexibility as measured by SHAPE 
reactivity suggests that the backbone flexibility is mostly 
hindered by the mobility of the bases. 

In conclusion, elastic network models were here com¬ 
pared systematically with fully atomistic molecular dy¬ 
namics simulations and with SHAPE reactivities. We 
found that, in spite of their simplistic nature, the three- 
center (SBP) and all-atom (AA) elastic networks are ca¬ 
pable of properly reproducing both MD fluctuations and 
chemical probing experimental data. Of these two accu¬ 
rate ENMs, the three-center model (SBP), provides an 
ideal compromise between accuracy and computational 
complexity, given that retaining the full atomistic detail 
when modeling large structures, such as the ribosome 
and other macromolecular RNA/protein complexes, can 
be computationally very demanding. 

A module that implements the ENM for RNA dis¬ 
cussed in this paper has been included in the baRNAba 
analysis tool (http://github.com/srnas/barnaba). 
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